Overview

Dataset statistics

Number of variables: 28
Number of observations: 21464
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 MiB
Average record size in memory: 152.0 B
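
As a rough cross-check of the memory figures above, the total size should equal the number of observations times the average record size. The sketch below uses only the rounded values printed in the report and assumes 1 MiB = 2**20 bytes, so it can confirm consistency only to the displayed precision. The helper names are illustrative and not part of pandas-profiling.

```python
# Cross-check the overview figures using only the rounded values shown above.
N_OBSERVATIONS = 21464
N_VARIABLES = 28
AVG_RECORD_BYTES = 152.0   # "Average record size in memory"
BYTES_PER_MIB = 2 ** 20    # assumption: MiB means the binary mebibyte


def total_size_mib(n_rows: int, avg_record_bytes: float) -> float:
    """Total in-memory size implied by the row count and average record size."""
    return n_rows * avg_record_bytes / BYTES_PER_MIB


def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Missing cells as a percentage of all cells (rows x columns)."""
    return 100.0 * missing_cells / (n_rows * n_cols)


implied_total = total_size_mib(N_OBSERVATIONS, AVG_RECORD_BYTES)
print(f"{implied_total:.1f} MiB")
print(f"{missing_cell_pct(0, N_OBSERVATIONS, N_VARIABLES):.1f}%")
```

The implied total agrees with the reported 3.1 MiB after rounding to one decimal place.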

Variable types

Numeric: 7
Categorical: 21

Warnings

Customer ID has a high cardinality: 21464 distinct values High cardinality
Income (USD) is highly correlated with Property Age High correlation
Loan Amount Request (USD) is highly correlated with Current Loan Expenses (USD) and 1 other field High correlation
Current Loan Expenses (USD) is highly correlated with Loan Amount Request (USD) and 1 other field High correlation
Property Age is highly correlated with Income (USD) High correlation
Property Price is highly correlated with Loan Amount Request (USD) and 1 other field High correlation
Type of Employment_missing_val is highly correlated with Profession_Pensioner High correlation
Profession_Working is highly correlated with Profession_Commercial associate High correlation
Profession_Commercial associate is highly correlated with Profession_Working High correlation
Profession_Pensioner is highly correlated with Type of Employment_missing_val High correlation
Location_Semi-Urban is highly correlated with Location_Rural High correlation
Location_Rural is highly correlated with Location_Semi-Urban High correlation
Current Loan Expenses (USD) is highly correlated with Property Price and 1 other field High correlation
Profession_Working is highly correlated with Profession_Commercial associate and 1 other field High correlation
Income (USD) is highly correlated with Property Age High correlation
Location_Semi-Urban is highly correlated with Location_Rural High correlation
Property Price is highly correlated with Current Loan Expenses (USD) and 1 other field High correlation
Dependents_1.0 is highly correlated with Dependents_2.0 High correlation
Property Age is highly correlated with Income (USD) High correlation
Type of Employment_missing_val is highly correlated with Profession_Pensioner High correlation
Profession_Commercial associate is highly correlated with Profession_Working High correlation
Profession_Pensioner is highly correlated with Profession_Working and 1 other field High correlation
Dependents_2.0 is highly correlated with Dependents_1.0 and 1 other field High correlation
Dependents_3.0 is highly correlated with Dependents_2.0 High correlation
df_index is uniformly distributed Uniform
Customer ID is uniformly distributed Uniform
df_index has unique values Unique
Customer ID has unique values Unique
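
The high-correlation warnings above can be reproduced in spirit by computing a pairwise coefficient and flagging pairs whose absolute value exceeds a cutoff. The sketch below is a minimal illustration, not the profiler's implementation: the choice of Pearson correlation, the 0.9 threshold, and the toy columns are all assumptions, since the report does not state which coefficient or cutoff triggered each warning.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def high_correlation_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:  # threshold value is an assumption
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Toy data (hypothetical, not taken from the profiled dataset).
toy = {
    "loan_request": [10.0, 20.0, 30.0, 40.0, 50.0],
    "property_price": [15.0, 31.0, 44.0, 61.0, 74.0],
    "credit_score": [700.0, 650.0, 720.0, 640.0, 705.0],
}
flagged = high_correlation_pairs(toy)
print(flagged)
```

Warnings between indicator columns such as Location_Rural and Location_Semi-Urban are expected under this kind of check, because mutually exclusive 0/1 columns derived from one category are negatively correlated by construction.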

Reproduction

Analysis started: 2021-06-27 06:21:29.771594
Analysis finished: 2021-06-27 06:22:09.618969
Duration: 39.85 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
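
The reported duration follows directly from the two timestamps. A small check, assuming both timestamps share the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block above.
started = datetime.strptime("2021-06-27 06:21:29.771594", FMT)
finished = datetime.strptime("2021-06-27 06:22:09.618969", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```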

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 21464
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15039.55814
Minimum: 0
Maximum: 29999
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 167.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1501.45
Q1: 7577.5
median: 15097.5
Q3: 22477.5
95-th percentile: 28497.85
Maximum: 29999
Range: 29999
Interquartile range (IQR): 14900

Descriptive statistics

Standard deviation: 8635.922902
Coefficient of variation (CV): 0.5742138711
Kurtosis: -1.191862581
Mean: 15039.55814
Median Absolute Deviation (MAD): 7445
Skewness: -0.00767563033
Sum: 322809076
Variance: 74579164.37
Monotonicity: Not monotonic
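
Several of the figures above are derived from others: the IQR is Q3 minus Q1, the range is maximum minus minimum, the CV is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below computes these from raw values with the standard library. It assumes linearly interpolated quantiles and a sample standard deviation (n - 1 denominator); the report does not state either convention, so treat both as assumptions.

```python
import statistics


def describe(values):
    """Derived statistics in the style of the blocks above."""
    # "inclusive" interpolates linearly between order statistics (assumption).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (assumption)
    mad = statistics.median(abs(v - median) for v in values)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "cv": std / mean,
        "variance": std ** 2,
        "mad": mad,
    }


print(describe([1, 2, 3, 4, 5, 6, 7, 8, 9]))

# Consistency check on the df_index figures printed above.
REPORTED = {"std": 8635.922902, "mean": 15039.55814, "q1": 7577.5, "q3": 22477.5}
reported_cv = REPORTED["std"] / REPORTED["mean"]   # compare with the reported CV
reported_iqr = REPORTED["q3"] - REPORTED["q1"]     # compare with the reported IQR
print(reported_cv, reported_iqr)
```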
Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
2047                  1      < 0.1%
19052                 1      < 0.1%
23150                 1      < 0.1%
21103                 1      < 0.1%
10864                 1      < 0.1%
629                   1      < 0.1%
4727                  1      < 0.1%
27256                 1      < 0.1%
29307                 1      < 0.1%
19068                 1      < 0.1%
Other values (21454)  21454  > 99.9%

Minimum 10 values

Value  Count  Frequency (%)
0      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%
11     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
29999  1      < 0.1%
29998  1      < 0.1%
29996  1      < 0.1%
29994  1      < 0.1%
29993  1      < 0.1%
29992  1      < 0.1%
29991  1      < 0.1%
29990  1      < 0.1%
29989  1      < 0.1%
29988  1      < 0.1%

Customer ID
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 21464
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value                 Count
C-31712               1
C-9550                1
C-32444               1
C-37121               1
C-17283               1
Other values (21459)  21459

Length

Max length: 7
Median length: 7
Mean length: 6.780609392
Min length: 3
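
The length figures can be recomputed from the raw strings, and the mean length ties to the character total reported further down: mean length times the number of rows gives the total character count. The sketch below is illustrative only; the sample IDs are taken from values shown in this report, and `length_stats` is a hypothetical helper name.

```python
import statistics


def length_stats(values):
    """Length summary for a list of strings."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.fmean(lengths),
        "min": min(lengths),
        "total_characters": sum(lengths),
    }


sample_ids = ["C-25431", "C-30796", "C-41817", "C-20088", "C-24848", "C-9550"]
print(length_stats(sample_ids))

# Mean length x number of rows should reproduce the reported total characters.
implied_total_characters = round(6.780609392 * 21464)
print(implied_total_characters)
```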

Characters and Unicode

Total characters: 145539
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
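
A minimal sketch of how per-category character counts like those in this section can be derived with Python's standard `unicodedata` module. The mapping from two-letter general-category codes to the display names used in this report is written out by hand and covers only the categories that appear here; scripts and blocks are not available from the standard library and are left out.

```python
import unicodedata
from collections import Counter

# Hand-written mapping from Unicode general-category codes to display names.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(["C-31712", "C-9550", "C-32444"])
print(counts)
```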

Unique

Unique: 21464
Unique (%): 100.0%

Sample

1st row: C-25431
2nd row: C-30796
3rd row: C-41817
4th row: C-20088
5th row: C-24848

Common Values

Value                 Count  Frequency (%)
C-31712               1      < 0.1%
C-9550                1      < 0.1%
C-32444               1      < 0.1%
C-37121               1      < 0.1%
C-17283               1      < 0.1%
C-48836               1      < 0.1%
C-7581                1      < 0.1%
C-18019               1      < 0.1%
C-48588               1      < 0.1%
C-46582               1      < 0.1%
Other values (21454)  21454  > 99.9%

Length

Histogram of lengths of the category

Value                 Count  Frequency (%)
c-6872                1      < 0.1%
c-28293               1      < 0.1%
c-16888               1      < 0.1%
c-4863                1      < 0.1%
c-23317               1      < 0.1%
c-28383               1      < 0.1%
c-25041               1      < 0.1%
c-899                 1      < 0.1%
c-25771               1      < 0.1%
c-9415                1      < 0.1%
Other values (21454)  21454  > 99.9%

Most occurring characters

Value             Count  Frequency (%)
C                 21464  14.7%
-                 21464  14.7%
3                 12994  8.9%
4                 12979  8.9%
1                 12903  8.9%
2                 12799  8.8%
7                 8604   5.9%
8                 8596   5.9%
9                 8563   5.9%
5                 8522   5.9%
Other values (2)  16651  11.4%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    102611  70.5%
Uppercase Letter  21464   14.7%
Dash Punctuation  21464   14.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
3      12994  12.7%
4      12979  12.6%
1      12903  12.6%
2      12799  12.5%
7      8604   8.4%
8      8596   8.4%
9      8563   8.3%
5      8522   8.3%
6      8453   8.2%
0      8198   8.0%

Uppercase Letter
Value  Count  Frequency (%)
C      21464  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      21464  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  124075  85.3%
Latin   21464   14.7%

Most frequent character per script

Common
Value  Count  Frequency (%)
-      21464  17.3%
3      12994  10.5%
4      12979  10.5%
1      12903  10.4%
2      12799  10.3%
7      8604   6.9%
8      8596   6.9%
9      8563   6.9%
5      8522   6.9%
6      8453   6.8%

Latin
Value  Count  Frequency (%)
C      21464  100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  145539  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
C                 21464  14.7%
-                 21464  14.7%
3                 12994  8.9%
4                 12979  8.9%
1                 12903  8.9%
2                 12799  8.8%
7                 8604   5.9%
8                 8596   5.9%
9                 8563   5.9%
5                 8522   5.9%
Other values (2)  16651  11.4%

Income (USD)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 17467
Distinct (%): 81.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2238.407781
Minimum: 378.76
Maximum: 4497.57
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 167.8 KiB

Quantile statistics

Minimum: 378.76
5-th percentile: 1051.15
Q1: 1691.32
median: 2222.435
Q3: 2650.6725
95-th percentile: 3806.101
Maximum: 4497.57
Range: 4118.81
Interquartile range (IQR): 959.3525

Descriptive statistics

Standard deviation: 784.0411152
Coefficient of variation (CV): 0.3502673292
Kurtosis: 0.1118915527
Mean: 2238.407781
Median Absolute Deviation (MAD): 496.515
Skewness: 0.5650171066
Sum: 48045184.62
Variance: 614720.4702
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
2222.435              3397   15.8%
2444.68               3      < 0.1%
2471.78               3      < 0.1%
1060.65               3      < 0.1%
1906.59               3      < 0.1%
2442.49               3      < 0.1%
1818.15               3      < 0.1%
1273.08               3      < 0.1%
2114.52               3      < 0.1%
1617.97               3      < 0.1%
Other values (17457)  18040  84.0%

Minimum 10 values

Value   Count  Frequency (%)
378.76  1      < 0.1%
393.09  1      < 0.1%
418.9   1      < 0.1%
437.63  1      < 0.1%
438.44  1      < 0.1%
442.47  1      < 0.1%
450.16  1      < 0.1%
472.04  1      < 0.1%
487.81  1      < 0.1%
495.7   1      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
4497.57  1      < 0.1%
4496.93  1      < 0.1%
4496.44  1      < 0.1%
4495.61  1      < 0.1%
4493.53  1      < 0.1%
4493.3   1      < 0.1%
4490.84  1      < 0.1%
4489.91  1      < 0.1%
4489.73  1      < 0.1%
4486.92  1      < 0.1%
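
The most frequent Income (USD) value, 2222.435, equals the reported median and accounts for 15.8% of rows. One plausible explanation, which the report itself does not confirm, is that missing incomes were filled with the median before profiling. The sketch below flags this pattern; the 5% share cutoff and the helper name `imputation_spike` are arbitrary choices made for illustration.

```python
import statistics
from collections import Counter


def imputation_spike(values, min_share=0.05):
    """Return (value, share) if the mode equals the median and is common, else None."""
    value, count = Counter(values).most_common(1)[0]
    share = count / len(values)
    if value == statistics.median(values) and share >= min_share:
        return value, share
    return None


# Toy data (hypothetical): three rows share the median value.
toy_income = [1200.0, 1850.5, 2222.435, 2222.435, 2222.435, 2600.0, 3100.25]
spike = imputation_spike(toy_income)
print(spike)
```

A spike found this way is only a hint; a genuinely common value (a salary band, a rounded figure) would trigger the same check.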

Loan Amount Request (USD)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 21456
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83241.6175
Minimum: 6048.24
Maximum: 361883.74
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 167.8 KiB

Quantile statistics

Minimum: 6048.24
5-th percentile: 20560.837
Q1: 39966.66
median: 71460.095
Q3: 113400.31
95-th percentile: 189614.8265
Maximum: 361883.74
Range: 355835.5
Interquartile range (IQR): 73433.65

Descriptive statistics

Standard deviation: 54013.27145
Coefficient of variation (CV): 0.648873401
Kurtosis: 1.088499815
Mean: 83241.6175
Median Absolute Deviation (MAD): 33884.185
Skewness: 1.09971624
Sum: 1786698078
Variance: 2917433493
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
24892.32              2      < 0.1%
81346.59              2      < 0.1%
42718.74              2      < 0.1%
71723.64              2      < 0.1%
73908.66              2      < 0.1%
54256.33              2      < 0.1%
30810.81              2      < 0.1%
43921.71              2      < 0.1%
99765.34              1      < 0.1%
33556.07              1      < 0.1%
Other values (21446)  21446  99.9%

Minimum 10 values

Value    Count  Frequency (%)
6048.24  1      < 0.1%
6108.05  1      < 0.1%
6145.01  1      < 0.1%
6174.7   1      < 0.1%
6189.5   1      < 0.1%
6307.15  1      < 0.1%
6310.26  1      < 0.1%
6341.02  1      < 0.1%
6431.37  1      < 0.1%
6436.14  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
361883.74  1      < 0.1%
351190.52  1      < 0.1%
350588.1   1      < 0.1%
349610.15  1      < 0.1%
346164.46  1      < 0.1%
343465.74  1      < 0.1%
336983.76  1      < 0.1%
336792.37  1      < 0.1%
335195.63  1      < 0.1%
333298.19  1      < 0.1%

Current Loan Expenses (USD)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18065
Distinct (%): 84.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 382.1806378
Minimum: 33.76
Maximum: 927.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 167.8 KiB

Quantile statistics

Minimum: 33.76
5-th percentile: 128.419
Q1: 241.445
median: 362.63
Q3: 491.6425
95-th percentile: 731.6885
Maximum: 927.02
Range: 893.26
Interquartile range (IQR): 250.1975

Descriptive statistics

Standard deviation: 181.0748457
Coefficient of variation (CV): 0.473793876
Kurtosis: -0.1708821678
Mean: 382.1806378
Median Absolute Deviation (MAD): 124.52
Skewness: 0.5893548918
Sum: 8203125.21
Variance: 32788.09976
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
375.205               124    0.6%
333.79                5      < 0.1%
366.37                5      < 0.1%
223.38                4      < 0.1%
469.01                4      < 0.1%
429.45                4      < 0.1%
233.95                4      < 0.1%
430.66                4      < 0.1%
323.12                4      < 0.1%
230.76                4      < 0.1%
Other values (18055)  21302  99.2%

Minimum 10 values

Value  Count  Frequency (%)
33.76  1      < 0.1%
34.04  1      < 0.1%
42.13  1      < 0.1%
43.09  1      < 0.1%
44.23  1      < 0.1%
44.63  1      < 0.1%
47.78  1      < 0.1%
48.2   1      < 0.1%
48.23  1      < 0.1%
48.6   1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
927.02  1      < 0.1%
926.83  1      < 0.1%
926.68  2      < 0.1%
926.65  1      < 0.1%
926.25  1      < 0.1%
925.5   1      < 0.1%
925.14  1      < 0.1%
924.57  1      < 0.1%
923.65  1      < 0.1%
923.57  1      < 0.1%

Credit Score
Real number (ℝ≥0)

Distinct: 14245
Distinct (%): 66.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 738.0184775
Minimum: 580
Maximum: 896.26
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 167.8 KiB

Quantile statistics

Minimum: 580
5-th percentile: 624.6415
Q1: 682.745
median: 739.82
Q3: 792.88
95-th percentile: 852.6485
Maximum: 896.26
Range: 316.26
Interquartile range (IQR): 110.135

Descriptive statistics

Standard deviation: 70.22670608
Coefficient of variation (CV): 0.09515575588
Kurtosis: -0.874490693
Mean: 738.0184775
Median Absolute Deviation (MAD): 55.345
Skewness: 0.001901974457
Sum: 15840828.6
Variance: 4931.790247
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
739.82                1205   5.6%
777.86                7      < 0.1%
762.03                6      < 0.1%
727.48                6      < 0.1%
667.26                6      < 0.1%
792.76                6      < 0.1%
789.28                6      < 0.1%
772.41                5      < 0.1%
843.03                5      < 0.1%
731.81                5      < 0.1%
Other values (14235)  20207  94.1%

Minimum 10 values

Value   Count  Frequency (%)
580     1      < 0.1%
580.85  1      < 0.1%
581.66  1      < 0.1%
582.3   1      < 0.1%
582.62  1      < 0.1%
583.05  1      < 0.1%
583.42  1      < 0.1%
583.54  1      < 0.1%
583.55  1      < 0.1%
583.57  1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
896.26  1      < 0.1%
891.95  1      < 0.1%
890.02  1      < 0.1%
889.79  1      < 0.1%
889.72  1      < 0.1%
889.24  1      < 0.1%
889.03  1      < 0.1%
888.77  1      < 0.1%
888.76  2      < 0.1%
888.55  1      < 0.1%

No. of Defaults
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value  Count
0      17321
1      4143

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 21464
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      17321  80.7%
1      4143   19.3%
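
The Common Values table is a plain frequency count with each count expressed as a share of all rows. A minimal sketch that rebuilds the table above from the two counts; rounding to one decimal place is assumed from the way the percentages are displayed.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (value, count, round(100 * count / total, 1))
        for value, count in counts.most_common()
    ]


# Rebuild the column from the counts reported above.
no_of_defaults = ["0"] * 17321 + ["1"] * 4143
table = frequency_table(no_of_defaults)
for value, count, pct in table:
    print(f"{value}  {count}  {pct}%")
```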

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      17321  80.7%
1      4143   19.3%

Most occurring characters

Value  Count  Frequency (%)
0      17321  80.7%
1      4143   19.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      17321  80.7%
1      4143   19.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      17321  80.7%
1      4143   19.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      17321  80.7%
1      4143   19.3%

Property Age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 17300
Distinct (%): 80.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2238.629857
Minimum: 378.76
Maximum: 4497.57
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 167.8 KiB

Quantile statistics

Minimum: 378.76
5-th percentile: 1053.3005
Q1: 1696.68
median: 2223.25
Q3: 2641.065
95-th percentile: 3800.7575
Maximum: 4497.57
Range: 4118.81
Interquartile range (IQR): 944.385

Descriptive statistics

Standard deviation: 780.2187961
Coefficient of variation (CV): 0.3485251453
Kurtosis: 0.14084375
Mean: 2238.629857
Median Absolute Deviation (MAD): 488.835
Skewness: 0.5679545199
Sum: 48049951.24
Variance: 608741.3698
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
2223.25               3579   16.7%
1868.78               3      < 0.1%
2312.37               3      < 0.1%
1906.59               3      < 0.1%
2415.7                3      < 0.1%
2471.78               3      < 0.1%
2444.68               3      < 0.1%
2117.34               3      < 0.1%
1617.97               3      < 0.1%
1519.5                3      < 0.1%
Other values (17290)  17858  83.2%

Minimum 10 values

Value   Count  Frequency (%)
378.76  1      < 0.1%
393.09  1      < 0.1%
418.9   1      < 0.1%
437.63  1      < 0.1%
438.44  1      < 0.1%
442.47  1      < 0.1%
450.16  1      < 0.1%
472.04  1      < 0.1%
487.81  1      < 0.1%
495.7   1      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
4497.57  1      < 0.1%
4496.93  1      < 0.1%
4496.44  1      < 0.1%
4495.61  1      < 0.1%
4493.53  1      < 0.1%
4493.3   1      < 0.1%
4490.84  1      < 0.1%
4489.91  1      < 0.1%
4489.73  1      < 0.1%
4486.92  1      < 0.1%

Co-Applicant
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value  Count
1.0    18336
0.0    3128

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 64392
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value  Count  Frequency (%)
1.0    18336  85.4%
0.0    3128   14.6%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1.0    18336  85.4%
0.0    3128   14.6%

Most occurring characters

Value  Count  Frequency (%)
0      24592  38.2%
.      21464  33.3%
1      18336  28.5%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     42928  66.7%
Other Punctuation  21464  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      24592  57.3%
1      18336  42.7%

Other Punctuation
Value  Count  Frequency (%)
.      21464  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  64392  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      24592  38.2%
.      21464  33.3%
1      18336  28.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  64392  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      24592  38.2%
.      21464  33.3%
1      18336  28.5%

Property Price
Real number (ℝ)

HIGH CORRELATION

Distinct: 21216
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 123415.822
Minimum: -999
Maximum: 625889.29
Zeros: 0
Zeros (%): 0.0%
Negative: 244
Negative (%): 1.1%
Memory size: 167.8 KiB

Quantile statistics

Minimum: -999
5-th percentile: 27471.534
Q1: 58524.18
median: 103827.865
Q3: 166848.4225
95-th percentile: 291140.467
Maximum: 625889.29
Range: 626888.29
Interquartile range (IQR): 108324.2425

Descriptive statistics

Standard deviation: 84851.19201
Coefficient of variation (CV): 0.6875228043
Kurtosis: 1.735610759
Mean: 123415.822
Median Absolute Deviation (MAD): 50678.825
Skewness: 1.232984052
Sum: 2648997203
Variance: 7199724786
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
-999                  244    1.1%
57914.08              2      < 0.1%
356356.48             2      < 0.1%
53075.16              2      < 0.1%
279299.82             2      < 0.1%
51395.67              2      < 0.1%
85370.86              1      < 0.1%
192978.67             1      < 0.1%
150370.68             1      < 0.1%
245940.94             1      < 0.1%
Other values (21206)  21206  98.8%

Minimum 10 values

Value    Count  Frequency (%)
-999     244    1.1%
7265.95  1      < 0.1%
7309.87  1      < 0.1%
7439.12  1      < 0.1%
7859.62  1      < 0.1%
7900.34  1      < 0.1%
8012.24  1      < 0.1%
8029.94  1      < 0.1%
8231.39  1      < 0.1%
8290.92  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
625889.29  1      < 0.1%
622793.92  1      < 0.1%
596147.38  1      < 0.1%
558179.18  1      < 0.1%
556390.54  1      < 0.1%
555225.1   1      < 0.1%
554514.94  1      < 0.1%
550572.94  1      < 0.1%
549754.86  1      < 0.1%
544450.8   1      < 0.1%
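
All 244 negative Property Price values equal -999, and the next smallest value is 7265.95. This pattern suggests that -999 is a placeholder for an unknown price rather than a real amount, although the report does not say so; the sketch below treats it as such purely as an assumption. It masks the placeholder and also estimates, from the reported Sum and counts, what the mean would be without those rows.

```python
import statistics

SENTINEL = -999  # assumption: placeholder for an unknown price


def mask_sentinel(values, sentinel=SENTINEL):
    """Replace the sentinel with None so it no longer counts as a price."""
    return [None if v == sentinel else v for v in values]


def mean_ignoring_missing(values):
    """Mean of the values that are not None."""
    present = [v for v in values if v is not None]
    return statistics.fmean(present)


# Toy data (hypothetical rows; the non-sentinel numbers are quantiles from above).
toy_prices = [-999, 58524.18, 103827.865, -999, 166848.4225]
masked = mask_sentinel(toy_prices)
print(mean_ignoring_missing(masked))

# Estimate from the reported aggregates: remove 244 sentinel rows from the sum.
REPORTED_SUM = 2648997203   # rounded as printed in the report
N_ROWS = 21464
N_SENTINEL = 244
adjusted_mean = (REPORTED_SUM - N_SENTINEL * SENTINEL) / (N_ROWS - N_SENTINEL)
print(round(adjusted_mean, 2))
```

Under this assumption the mean rises only slightly, because the placeholder rows are about 1.1% of the data, but the minimum, range, and negative count change substantially.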

Type of Employment_missing_val
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value  Count
0      16184
1      5280

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 21464
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      16184  75.4%
1      5280   24.6%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      16184  75.4%
1      5280   24.6%

Most occurring characters

Value  Count  Frequency (%)
0      16184  75.4%
1      5280   24.6%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      16184  75.4%
1      5280   24.6%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      16184  75.4%
1      5280   24.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      16184  75.4%
1      5280   24.6%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value  Count
0      17315
1      4149

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 21464
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      17315  80.7%
1      4149   19.3%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      17315  80.7%
1      4149   19.3%

Most occurring characters

Value  Count  Frequency (%)
0      17315  80.7%
1      4149   19.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      17315  80.7%
1      4149   19.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      17315  80.7%
1      4149   19.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      17315  80.7%
1      4149   19.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value  Count
0      18678
1      2786

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 21464
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      18678  87.0%
1      2786   13.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      18678  87.0%
1      2786   13.0%

Most occurring characters

Value  Count  Frequency (%)
0      18678  87.0%
1      2786   13.0%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      18678  87.0%
1      2786   13.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      18678  87.0%
1      2786   13.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      18678  87.0%
1      2786   13.0%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value  Count
0      19141
1      2323

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 21464
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      19141  89.2%
1      2323   10.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      19141  89.2%
1      2323   10.8%

Most occurring characters

Value  Count  Frequency (%)
0      19141  89.2%
1      2323   10.8%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      19141  89.2%
1      2323   10.8%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      19141  89.2%
1      2323   10.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      19141  89.2%
1      2323   10.8%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value  Count
0      20065
1      1399

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 21464
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      20065  93.5%
1      1399   6.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      20065  93.5%
1      1399   6.5%

Most occurring characters

Value  Count  Frequency (%)
0      20065  93.5%
1      1399   6.5%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      20065  93.5%
1      1399   6.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      20065  93.5%
1      1399   6.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      20065  93.5%
1      1399   6.5%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value  Count
0      20320
1      1144

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 21464
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      20320  94.7%
1      1144   5.3%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      20320  94.7%
1      1144   5.3%

Most occurring characters

Value  Count  Frequency (%)
0      20320  94.7%
1      1144   5.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      20320  94.7%
1      1144   5.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      20320  94.7%
1      1144   5.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      20320  94.7%
1      1144   5.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 167.8 KiB

Value  Count
0      20514
1      950

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 21464
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      20514  95.6%
1      950    4.4%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      20514  95.6%
1      950    4.4%

Most occurring characters

Value  Count  Frequency (%)
0      20514  95.6%
1      950    4.4%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      20514  95.6%
1      950    4.4%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      20514  95.6%
1      950    4.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      20514  95.6%
1      950    4.4%

Expense Type 1_N
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size167.8 KiB
1
14107 
0
7357 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters21464
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  1
3rd row  0
4th row  1
5th row  0

Common Values

Value  Count  Frequency (%)
1      14107  65.7%
0      7357   34.3%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
1      14107  65.7%
0      7357   34.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
1      14107  65.7%
0      7357   34.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
1      14107  65.7%
0      7357   34.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
1      14107  65.7%
0      7357   34.3%

Profession_Working
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
1      12416
0      9048

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  0
3rd row  1
4th row  0
5th row  1

Common Values

Value  Count  Frequency (%)
1      12416  57.8%
0      9048   42.2%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
1      12416  57.8%
0      9048   42.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
1      12416  57.8%
0      9048   42.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
1      12416  57.8%
0      9048   42.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
1      12416  57.8%
0      9048   42.2%

Profession_Commercial associate
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
0      16109
1      5355

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  1
2nd row  1
3rd row  0
4th row  1
5th row  0

Common Values

Value  Count  Frequency (%)
0      16109  75.1%
1      5355   24.9%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
0      16109  75.1%
1      5355   24.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      16109  75.1%
1      5355   24.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      16109  75.1%
1      5355   24.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      16109  75.1%
1      5355   24.9%

Profession_Pensioner
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
0      19414
1      2050

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  0
3rd row  0
4th row  0
5th row  0

Common Values

Value  Count  Frequency (%)
0      19414  90.4%
1      2050   9.6%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
0      19414  90.4%
1      2050   9.6%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      19414  90.4%
1      2050   9.6%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      19414  90.4%
1      2050   9.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      19414  90.4%
1      2050   9.6%

Dependents_2.0
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
0      12144
1      9320

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  0
3rd row  0
4th row  0
5th row  0

Common Values

Value  Count  Frequency (%)
0      12144  56.6%
1      9320   43.4%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
0      12144  56.6%
1      9320   43.4%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      12144  56.6%
1      9320   43.4%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      12144  56.6%
1      9320   43.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      12144  56.6%
1      9320   43.4%

Dependents_3.0
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
0      17359
1      4105

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  0
3rd row  0
4th row  0
5th row  0

Common Values

Value  Count  Frequency (%)
0      17359  80.9%
1      4105   19.1%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
0      17359  80.9%
1      4105   19.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      17359  80.9%
1      4105   19.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      17359  80.9%
1      4105   19.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      17359  80.9%
1      4105   19.1%

Dependents_1.0
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
0      17423
1      4041

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  1
3rd row  1
4th row  0
5th row  0

Common Values

Value  Count  Frequency (%)
0      17423  81.2%
1      4041   18.8%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
0      17423  81.2%
1      4041   18.8%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      17423  81.2%
1      4041   18.8%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      17423  81.2%
1      4041   18.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      17423  81.2%
1      4041   18.8%

Dependents_4.0
Categorical

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
0      19527
1      1937

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  0
3rd row  0
4th row  0
5th row  0

Common Values

Value  Count  Frequency (%)
0      19527  91.0%
1      1937   9.0%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
0      19527  91.0%
1      1937   9.0%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      19527  91.0%
1      1937   9.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      19527  91.0%
1      1937   9.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      19527  91.0%
1      1937   9.0%

Dependents_0
Categorical

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
0      19711
1      1753

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  0
3rd row  0
4th row  1
5th row  1

Common Values

Value  Count  Frequency (%)
0      19711  91.8%
1      1753   8.2%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
0      19711  91.8%
1      1753   8.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      19711  91.8%
1      1753   8.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      19711  91.8%
1      1753   8.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      19711  91.8%
1      1753   8.2%

Location_Semi-Urban
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
1      15622
0      5842

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  1
2nd row  1
3rd row  0
4th row  0
5th row  1

Common Values

Value  Count  Frequency (%)
1      15622  72.8%
0      5842   27.2%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
1      15622  72.8%
0      5842   27.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
1      15622  72.8%
0      5842   27.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
1      15622  72.8%
0      5842   27.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
1      15622  72.8%
0      5842   27.2%

Location_Rural
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   167.8 KiB

Value  Count
0      17486
1      3978

Length

Max length     1
Median length  1
Mean length    1
Min length     1

Characters and Unicode

Total characters     21464
Distinct characters  2
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  0
2nd row  0
3rd row  1
4th row  0
5th row  0

Common Values

Value  Count  Frequency (%)
0      17486  81.5%
1      3978   18.5%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts]

Most occurring characters

Value  Count  Frequency (%)
0      17486  81.5%
1      3978   18.5%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  21464  100.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      17486  81.5%
1      3978   18.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  21464  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      17486  81.5%
1      3978   18.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  21464  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0      17486  81.5%
1      3978   18.5%

Interactions

[Figures: 49 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
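
The calculation just described can be sketched in plain Python. The function name `pearson_r` and the two short lists are illustrative assumptions, not values from this dataset. The second call checks the invariance under changes in location and scale.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (n - 1 denominator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)

# Made-up example data.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
# Shifting and rescaling x should leave r unchanged.
r_rescaled = pearson_r([10 * a + 5 for a in x], y)
print(r, r_rescaled)
```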

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
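
A minimal sketch of this rank-based calculation follows. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are assumptions made for illustration, ties receive the average of their ranks, and the data are made up. The cubic relationship is monotonic but not linear, so ρ reaches 1 while Pearson's r stays below it.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up monotonic, nonlinear data: y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```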

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
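
The pair-counting definition can be sketched directly, under the assumption that tied pairs count as neither concordant nor discordant. The function name `kendall_tau` and the data are illustrative. Statistical libraries often apply a tie-adjusted variant, so their results may differ when ties are present.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Made-up data: only the middle two observations are out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau(x, y)
print(tau)
```

With four observations there are six pairs, of which five are concordant and one is discordant, so τ is (5 - 1) / 6.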

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
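
A minimal sketch of a bias-corrected Cramér's V, following the commonly cited form of Bergsma's correction, is shown below. The function name `cramers_v_corrected` and the two contingency tables are illustrative assumptions, and the exact computation used to produce this report may differ. The sketch assumes that no row or column of the table is empty.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic without continuity correction.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction: subtract the expected bias and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Made-up 2x2 tables: perfect association and exact independence.
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
v_independent = cramers_v_corrected([[25, 25], [25, 25]])
print(v_perfect, v_independent)
```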

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]
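
As a rough sketch of what the two nullity displays summarise, the snippet below counts missing entries per column and builds a present/absent matrix. The rows are made-up and contain gaps on purpose; the dataset profiled here reports zero missing cells.

```python
# Made-up rows with deliberate gaps (None marks a missing value).
rows = [
    {"Income (USD)": 1731.99, "Credit Score": 739.82},
    {"Income (USD)": None, "Credit Score": 700.24},
    {"Income (USD)": 2121.47, "Credit Score": None},
    {"Income (USD)": 2901.07, "Credit Score": 765.90},
]
columns = ["Income (USD)", "Credit Score"]

# Nullity by column: how many values are missing in each column.
missing_per_column = {c: sum(1 for row in rows if row[c] is None) for c in columns}

# Nullity matrix: one row per record, True where the value is present.
nullity_matrix = [[row[c] is not None for c in columns] for row in rows]

print(missing_per_column)
for line in nullity_matrix:
    print(line)
```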

Sample

First rows

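 | df_index | Customer ID | Income (USD) | Loan Amount Request (USD) | Current Loan Expenses (USD) | Credit Score | No. of Defaults | Property Age | Co-Applicant | Property Price | Type of Employment_missing_val | Type of Employment_Laborers | Type of Employment_Sales staff | Type of Employment_Core staff | Type of Employment_Managers | Type of Employment_Drivers | Type of Employment_Accountants | Expense Type 1_N | Profession_Working | Profession_Commercial associate | Profession_Pensioner | Dependents_2.0 | Dependents_3.0 | Dependents_1.0 | Dependents_4.0 | Dependents_0 | Location_Semi-Urban | Location_Rural
0 | 5152 | C-25431 | 1731.990 | 27709.83 | 140.16 | 739.82 | 1 | 1731.99 | 1.0 | 52045.88 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
1 | 25363 | C-30796 | 2814.170 | 188858.71 | 510.05 | 700.24 | 0 | 2814.17 | 1.0 | 268479.69 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0
2 | 21026 | C-41817 | 2121.470 | 134626.34 | 444.14 | 707.14 | 0 | 2121.47 | 1.0 | 233499.25 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1
3 | 844 | C-20088 | 2901.070 | 240936.30 | 670.18 | 765.90 | 0 | 2901.07 | 1.0 | 386796.09 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0
4 | 26146 | C-24848 | 2222.435 | 25499.21 | 278.08 | 690.62 | 1 | 2223.25 | 1.0 | 35278.10 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0
5 | 14819 | C-33563 | 2069.290 | 104248.66 | 335.17 | 735.62 | 0 | 2069.29 | 1.0 | 129824.27 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0
6 | 2345 | C-25659 | 2222.435 | 112172.96 | 346.67 | 768.18 | 0 | 2223.25 | 1.0 | 137905.05 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0
7 | 5556 | C-14991 | 1717.690 | 80456.84 | 403.18 | 697.73 | 0 | 1717.69 | 1.0 | 111191.58 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1
8 | 18244 | C-32904 | 1515.540 | 72297.39 | 439.65 | 751.95 | 0 | 1515.54 | 1.0 | 134809.06 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0
9 | 10801 | C-24723 | 2222.435 | 189895.78 | 523.14 | 757.77 | 0 | 2223.25 | 1.0 | 324744.34 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1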

Last rows

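 | df_index | Customer ID | Income (USD) | Loan Amount Request (USD) | Current Loan Expenses (USD) | Credit Score | No. of Defaults | Property Age | Co-Applicant | Property Price | Type of Employment_missing_val | Type of Employment_Laborers | Type of Employment_Sales staff | Type of Employment_Core staff | Type of Employment_Managers | Type of Employment_Drivers | Type of Employment_Accountants | Expense Type 1_N | Profession_Working | Profession_Commercial associate | Profession_Pensioner | Dependents_2.0 | Dependents_3.0 | Dependents_1.0 | Dependents_4.0 | Dependents_0 | Location_Semi-Urban | Location_Rural
21454 | 18872 | C-27261 | 1572.610 | 126663.54 | 372.82 | 804.12 | 0 | 1572.61 | 1.0 | 145199.39 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
21455 | 7060 | C-4849 | 1639.660 | 123796.56 | 527.43 | 739.82 | 0 | 1639.66 | 1.0 | 159519.12 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1
21456 | 24750 | C-27966 | 2639.470 | 30304.63 | 299.38 | 741.68 | 0 | 2639.47 | 1.0 | 35160.90 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0
21457 | 12663 | C-28251 | 2130.160 | 70367.85 | 241.25 | 676.67 | 0 | 2130.16 | 1.0 | 90880.92 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
21458 | 13430 | C-13417 | 2222.435 | 245591.32 | 926.68 | 816.86 | 1 | 2223.25 | 1.0 | 331291.05 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0
21459 | 24149 | C-14330 | 2222.435 | 146440.71 | 429.85 | 766.76 | 0 | 2223.25 | 1.0 | 253150.73 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
21460 | 6067 | C-18949 | 1755.410 | 77051.52 | 282.34 | 804.58 | 0 | 1755.41 | 0.0 | 115077.51 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0
21461 | 977 | C-5166 | 1711.330 | 37989.79 | 208.36 | 739.82 | 0 | 1711.33 | 0.0 | 44673.36 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1
21462 | 17711 | C-40526 | 1337.790 | 50723.30 | 374.06 | 692.37 | 0 | 1337.79 | 1.0 | 92324.46 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
21463 | 26446 | C-31486 | 1616.850 | 27012.91 | 139.26 | 652.59 | 0 | 1616.85 | 1.0 | 48124.58 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0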